A Statistical Analysis of On-Off 
Patterns in 16 Conversations 

By PAUL T. BRADY 

(Manuscript received September 14, 1967) 

This is a summary of data from an extensive analysis of on-off speech 
patterns in 16 experimental telephone conversations. The on-off patterns 
are determined by a fixed threshold speech detector having certain rules 
for rejecting noise and for filling in short gaps (for example, from stop 
consonants). Distributions are obtained for ten events, including talk- 
spurts, pauses, double talking (simultaneous speech from both parties), 
mutual silence, etc. Particular emphasis is placed on events surrounding 
interruptions. The entire analysis is performed for three speech detector 
thresholds, since most of the data are strongly influenced by choice of 
threshold. Observations are made about the influence of threshold on the 
data, properties of speech invariant with choice of threshold, and differences 
between male and female speech patterns. 

I. INTRODUCTION 

A statistical analysis of the on-off speech patterns of 16 recorded 
conversations has been obtained by a computer program written by 
Mrs. N. W. Shrimpton in 1963, and recently modified by the author. 
The data can serve the following purposes. 

(i) They can illustrate the effect of variation of threshold setting 
on the resulting speech data. This problem has been plaguing virtually 
all researchers who have attempted to arrive at the "basic" talkspurt- 
pause patterns, that is, patterns which represent the subjective on-off 
behavior, either as intended by the speaker or as perceived by the 
listener. (There is, of course, no certainty that such on-off classifica- 
tion actually occurs during normal talking and listening.) 

(ii) They can guide the design of voice operated devices, such as 
conventional echo suppressors 1 or an adaptive transversal filter echo 
canceller, 2 both of which have critical timing problems in the 
intervals surrounding interruptions. 

(iii) They can provide material for building stochastic models of 
speech patterns in conversations. Several studies have already used 
basic models (such as Markov processes) to approximate talkspurt 
and pause durations. 3,4 These models could be useful in predicting 
conversational behavior over special circuits, such as those containing 
transmission delay. 

Applicability of the data to the above-mentioned purposes is influ- 
enced by the source and nature of the speech material and by the 
speech detector used to obtain on-off patterns. Section II is a descrip- 
tion of the speech material, and Section III contains a description 
of the speech detector. This detector tries to yield patterns as close 
as possible to the original waveform, while making certain correc- 
tions to make the pattern representative of perceived speech patterns. 
These corrections, requiring two arbitrary parameters, include rejec- 
tion of impulse noise operation and bridging of gaps caused by stop 
consonants. The third parameter, threshold, has such an effect on the 
data that the analysis is performed for a range of thresholds. In other 
respects however, the detector preserves fine details of timing of 
events; for example, the attack and release times are less than 5 msec. 

Characterization of speech for speech detectors in the telephone 
system is a different problem from characterization of speech for 
modeling conversational speech patterns. The data of the present 
study are not intended to provide a basis for characterizing speech 
for telephone system speech detectors. Within the constraints of the 
corrections described in the preceding paragraph, however, the data 
can be extended to predict the behavior of certain speech detectors 
as explained in Section 5.2. 

II. THE CONVERSATIONS 

2.1 Source 

Of the 16 conversations, eight, obtained from four male pairs and 
four female pairs, lasted about 7 minutes each and were documented 
in a previous paper. 5 The remaining eight, also four male and four 
female pairs, lasted about 10 minutes each. The subjects talked over 
a 4-wire circuit such as illustrated in Fig. 1. The losses were typical 
of a long distance call, and there were no degrading factors such as 
noise, echo, or delay. The voices were recorded at the zero 
transmission level points (0 TLP), determined to be 6 dB "away from" 
the transmitters. (The TLP is an arbitrary reference level used to 
establish relative levels in a telephone circuit.) 

Fig. 1 - Circuit over which subjects talked. 

The members of each pair were close friends; we have found that 
conversation between strangers can be restrained and halting. Their 
instructions were as follows. 

"Your task in this experiment will be to converse with each other 
for approximately 10 minutes. You may talk about anything you 
wish, but keep in mind that you will be recorded. The recording will 
be kept private and will be used for computer analysis of speech. 
We ask that you both talk frequently; if only one person talks the 
conversation will be of almost no value to us." 

This method seemed to produce natural conversational speech which 
was not restrained by the subjects' knowledge that they were being 
recorded. 



2.2 Scope of the Conversations 

Since the experimental conversations do not represent a random 
sample of calls in a telephone office, they cannot provide documenta- 
tion of speech patterns on subscriber circuits. They are, however, of 
interest in their own right and even possess advantages over customer 
calls. 

(i) The experimental calls are recorded, and can be studied for 
contextual material, etc. 

(ii) The subjects are indeed conversing, rather than momentarily 
setting the phone down, or even switching off to other persons. In 
short, in the experimental calls, the subjects and tasks are known. 

(iii) Interest in transmission work is often centered on those parts 
of a call with active interchange, as our experimental calls generally 
had. This is especially true in echo suppressor 1 and speakerphone 
studies. 

III. THE SPEECH DETECTOR 

The technique of obtaining on-off speech patterns, although already 
documented, 8 is summarized as follows. A flip-flop is set any time 
speech (full-wave rectified and unfiltered) from speaker A crosses a 
threshold. This flip-flop is examined and cleared every 5 milliseconds, 
with the output being a 1 if the threshold was crossed, 0 otherwise. 
The resulting string of 1s (spurts) and 0s (gaps) is examined for short 
spurts; all spurts ≤15 msec* are erased. After this is done, all gaps 
≤200 msec are filled in to account for momentary interruptions, such 
as those due to stop consonants. The resulting on-off pattern consists, 
by definition used here, of talkspurts and pauses. An identical procedure 
is used for speaker B. 
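
The procedure just described can be sketched in a few lines. This is a 
minimal illustration under stated assumptions, not the program used in 
the study: the function names are hypothetical, "crossing" is taken to 
mean that the rectified sample exceeds the threshold, and silence at the 
very start or end of a record is assumed not to be filled in. 

```python
from itertools import groupby

FRAME_MS = 5  # the flip-flop is examined and cleared every 5 msec


def frame_bits(samples, threshold, samples_per_frame):
    """One bit per frame: 1 if any rectified sample exceeds the threshold."""
    bits = []
    for start in range(0, len(samples), samples_per_frame):
        frame = samples[start:start + samples_per_frame]
        bits.append(1 if any(abs(x) > threshold for x in frame) else 0)
    return bits


def _runs(bits):
    """Collapse a bit string into (value, length) runs."""
    return [(value, len(list(group))) for value, group in groupby(bits)]


def erase_short_spurts(bits, max_spurt_ms=15):
    """Erase every spurt (run of 1s) lasting max_spurt_ms or less."""
    limit = max_spurt_ms // FRAME_MS
    out = []
    for value, length in _runs(bits):
        if value == 1 and length <= limit:
            value = 0
        out.extend([value] * length)
    return out


def fill_short_gaps(bits, max_gap_ms=200):
    """Fill every interior gap (run of 0s) lasting max_gap_ms or less.

    Assumption: leading and trailing silence is left untouched.
    """
    limit = max_gap_ms // FRAME_MS
    runs = _runs(bits)
    out = []
    for i, (value, length) in enumerate(runs):
        interior = 0 < i < len(runs) - 1
        if value == 0 and interior and length <= limit:
            value = 1
        out.extend([value] * length)
    return out


def talkspurt_pattern(bits):
    """Apply the two corrections in the order given in the text."""
    return fill_short_gaps(erase_short_spurts(bits))
```

The order matters: a 10-msec noise burst is erased first, so it cannot 
later serve as the far end of a gap that would otherwise be bridged. 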

Three thresholds have been chosen: -45 dBm0† (most sensitive), 
-40, and -35. These values seemed to bracket the range between 
excessive noise operation and insufficient speech operation. The average 
peak level (apl)‡ for all 32 speakers was -18.9 dBm re TLP, 26.1 dB 
above the most sensitive threshold. If one prefers VUs, a previous 
study 7 showed that VUs obtained by Miss K. L. McAdoo (an 
experienced VU meter reader) are roughly 6 dB below the apls; hence, 
the average VU for that observer would have been near -25 dBm. 

IV. DATA 

4.1 Approximation in the Medians 

Although all means reported here are exact (in that they equal the 
total time in an event divided by the number of event occurrences) 
the medians are not exact because the measuring intervals are arbitrarily 
categorized. For example, the median talkspurt at the —45 dBm 
threshold is somewhere between 750 and 800 msec, and is reported 
at 775 msec, the interval midpoint. The measuring intervals are roughly 
proportional to the lengths of events; the intervals are as short as 
10 msec for events ≤200 msec and as long as 1 second for events 
> 6 seconds. 



* This was originally 10 msec, but 15 msec seems to be required for good im- 
pulse noise rejection. 

† -45 dBm measured at the TLP. 

‡ Average peak level is a measure of speech level based on the average 
log rectified speech voltage. 7 

4.2 Percent of Time Spent in Different States 

Table I shows three measures made per person or per conversa- 
tion: 

(i) Percent of time each person talked, averaged over 32 persons, 
obtained for each person by dividing his total speech time by the 
length of his conversation. 

(ii) Percent of time in double-talking, averaged over 16 conver- 
sations. 

(iii) Percent of time in mutual silence, averaged over 16 conver- 
sations. 

Table II shows two measures made on the entire sample of 137.4 
minutes of conversation: the percent of time in double talking, and 
the percent of time in mutual silence. Notice that mutual silence is 
the complement of the event that one or both speakers are talking. 
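
As an illustration of these measures, the sketch below computes the 
percentages for one conversation from the two on-off patterns. It is 
only a sketch: the function name is hypothetical, and the patterns are 
assumed to be equal-length lists with one 0/1 entry per 5-msec frame. 

```python
def state_percentages(a, b):
    """Percent of frames in each state for on-off patterns a and b.

    a and b are assumed to be equal-length 0/1 sequences, one entry
    per frame, already processed by the speech detector.
    """
    if len(a) != len(b) or not a:
        raise ValueError("patterns must be non-empty and of equal length")
    n = len(a)
    double = sum(1 for x, y in zip(a, b) if x and y)
    silence = sum(1 for x, y in zip(a, b) if not x and not y)
    return {
        "talking_a": 100.0 * sum(a) / n,
        "talking_b": 100.0 * sum(b) / n,
        "double_talking": 100.0 * double / n,
        "mutual_silence": 100.0 * silence / n,
    }
```

Within a single conversation the four figures are tied together: mutual 
silence equals 100 minus (A talking plus B talking minus double 
talking), which is the complement relation noted above. 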

4.3 Categorized Events 

Ten events were defined and measured. Figs. 2 through 11 are cumu- 
lative distribution plots of the events. The arrows show which event 
is being measured. For example, in Fig. 2, which shows the talkspurt 
cumulative distribution, there are three events illustrated and indi- 
cated by the arrows. 

The defined events are : 

(i) Talkspurt - defined in Section III. 

(ii) Pause - defined in Section III. 

(iii) Double talk - a time when speech is present from both A 
and B. 

(iv) Mutual silence - a time when silence is present from both 
A and B. 



Table I - Percent of Time in Different States* 

                         -45 dBm          -40 dBm          -35 dBm 
State                    Mean     σ       Mean     σ       Mean     σ 

Talking 
  (per person)           43.53    9.10    39.5     8.37    35.00    8.31 
Double talking 
  (per conversation)      6.58    3.47     4.49    2.41     3.10    1.81 
Mutual silence 
  (per conversation)     18.97    6.55    25.01    7.28    32.55    9.77 

* Average of 32 persons or 16 conversations. 

Table II - Percent of Time in States for Entire Sample* 

State              -45 dBm    -40 dBm    -35 dBm 

Double talking      6.78       4.62       3.22 
Mutual silence     19.07      24.99      32.37 

* 137.4 minutes, including all 16 conversations. 

(v) Alternation silence - the period of mutual silence between the 
end of one speaker's talkspurt and the beginning of the other's. Event 
5 is a subset of 4. If a speaker alternation results from an 
interruption so that there is no mutual silence period, then an 
alternation silence has not occurred. (There are no negative 
alternation silences.) 

(vi) Pause in isolation - a pause in which the other speaker is silent 
throughout the pause. Event 6 is a subset of both 2 and 4. 

(vii) Solitary talkspurt - a talkspurt which occurs entirely within 
the other speaker's silence. Event 7 is a subset of 1. 

(viii) Interruption - if A interrupts B, the time at which A's 
talkspurt begins determines the start of an interruption. The 
interruption terminates at the end of A's talkspurt, unless B stops and 
then interrupts A, in which case A's interruption terminates upon B's 
counter interruption. 

Fig. 2 - Talkspurts for 32 speakers. 

Fig. 4 - Doubletalk for 16 conversations. 

Fig. 5 - Mutual silence for 16 conversations. 

Fig. 7 - Pauses in isolation for 32 subjects. Events from A and B have 
been combined. 

Fig. 8 - Solitary talkspurts for 32 speakers. Events from A and B have 
been combined. 

Fig. 9 - Interruptions for 32 speakers. Events from A and B have been 
combined. 

Fig. 11 - Speech before interruption. Events from A and B have been 
combined. 

(ix) Speech after interruption - if A interrupts B, the remainder 
of B's talkspurt is entered here, unless A pauses and then again 
interrupts the same B talkspurt. The first "speech after interruption" 
would terminate upon A's reinterruption, and a second speech after 
interruption would begin. 

(x) Speech before interruption - if A interrupts B, B's speech 
interval up to the interruption is entered here. If A then pauses at 
time t1 and reinterrupts at time t2 (assuming B continues talking), a 
new B speech before interruption (t2 - t1) is entered. If A continues 
talking and B pauses and then counter interrupts, the length of B's 
pause is entered as A's speech before interruption. 
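
Several of these definitions translate directly into operations on the 
two on-off patterns. The sketch below covers events (iii) through (vii) 
only; the interruption events (viii) through (x) need more bookkeeping 
and are omitted. It is an illustration, not the program used in the 
study. The function names are hypothetical, durations are in frames, 
and an alternation silence is assumed to be a mutual silence with 
exactly one speaker talking just before it and only the other speaker 
talking just after it. 

```python
from itertools import groupby


def runs(bits):
    """Return (value, start, length) for each run in a 0/1 sequence."""
    out, pos = [], 0
    for value, group in groupby(bits):
        length = len(list(group))
        out.append((value, pos, length))
        pos += length
    return out


def classify(a, b):
    """Collect event durations (in frames) from patterns a and b."""
    both = [1 if x and y else 0 for x, y in zip(a, b)]
    neither = [1 if not x and not y else 0 for x, y in zip(a, b)]
    events = {
        "double_talk": [n for v, _, n in runs(both) if v],
        "mutual_silence": [n for v, _, n in runs(neither) if v],
        "solitary_talkspurt": [],
        "pause_in_isolation": [],
        "alternation_silence": [],
    }
    # Talkspurts and pauses during which the partner is silent throughout.
    for own, other in ((a, b), (b, a)):
        for value, start, n in runs(own):
            if any(other[start:start + n]):
                continue
            key = "solitary_talkspurt" if value else "pause_in_isolation"
            events[key].append(n)
    # Mutual silences bounded by different speakers on the two sides.
    for value, start, n in runs(neither):
        end = start + n
        if value and start > 0 and end < len(a):
            before = (a[start - 1], b[start - 1])
            after = (a[end], b[end])
            if {before, after} == {(1, 0), (0, 1)}:
                events["alternation_silence"].append(n)
    return events
```

Events from A and B are pooled in each list, as in the figures. 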

4.4 Tabulated Events for the Entire Sample 

Table III lists the mean, median, and number of events for talk- 
spurts, pauses, and doubletalks of the entire 137.4-minute conversa- 
tion sample. Notice that the talkspurts and pauses represent 274.8 
minutes of speech, since the A and B speech samples can be sepa- 
rated and placed end to end. 
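
The medians in such tabulations follow the interval-midpoint convention 
of Section 4.1. A minimal sketch of that convention is given below; the 
function name and the example counts are hypothetical, chosen only so 
that the median falls in the 750 to 800 msec interval mentioned there. 

```python
def binned_median(edges, counts):
    """Midpoint of the measuring interval that contains the median.

    edges  -- ascending interval boundaries, len(counts) + 1 values
    counts -- number of events observed in each interval
    """
    if len(edges) != len(counts) + 1:
        raise ValueError("need one more edge than counts")
    total = sum(counts)
    if total == 0:
        raise ValueError("no events")
    cumulative = 0
    for low, high, count in zip(edges, edges[1:], counts):
        cumulative += count
        if cumulative >= total / 2:
            return (low + high) / 2
    raise AssertionError("unreachable for non-empty counts")


# Hypothetical counts: the median lies between 750 and 800 msec,
# so the reported value is the midpoint of that interval.
print(binned_median([700, 750, 800, 850], [40, 30, 30]))
```

The error of the reported median is therefore at most half the width of 
the interval in which it falls. 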

THE BELL SYSTEM TECHNICAL JOURNAL, JANUARY 1968 

[Table III - Mean, median, and number of events for talkspurts, 
pauses, and doubletalks of the entire sample. Table body not 
recoverable.] 

4.5 Means and Sigmas for Averages of Events 

Table IV lists the means of the averages of the categorized events 
per person (or in some cases per conversation). For example, the 
mean talkspurt length of 1.366 second for a -45 dBm threshold is 
the average of 32 numbers, each in turn being the average talkspurt 
length for a particular speaker. The σ reported* is the standard 
deviation of the 32 (or 16) averages among speakers (or conversations). 
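
The two-stage averaging can be sketched as follows. The function name 
and the sample data are hypothetical. The standard library's `stdev` 
divides by n - 1, which agrees with the σ footnote of this paper if 
"sample variance" there is read as the variance with divisor n; that 
reading is an assumption. 

```python
from statistics import mean, stdev


def mean_and_sigma_of_averages(events_by_speaker):
    """Average each speaker's events, then summarize those averages.

    events_by_speaker -- one list of event durations per speaker
                         (or per conversation)
    Returns (mean of the averages, standard deviation of the averages).
    """
    averages = [mean(durations) for durations in events_by_speaker]
    return mean(averages), stdev(averages)
```

Each speaker carries equal weight in this figure regardless of how many 
events he produced, which is why it can differ from a mean taken over 
the pooled events of the entire sample. 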



Table IV - Means and Standard Deviations of the 
Averages of Events* 

                                         -45 dBm         -40 dBm         -35 dBm 
Event (k)                        n†      Mean    σ       Mean    σ       Mean    σ 
                                         (seconds)       (seconds)       (seconds) 

Talkspurt (32)                   ~181    1.366   0.442   1.197   0.444   0.980   0.425 
Pause (32)                       ~181    1.802   0.639   1.846   0.648   1.742   0.663 
Double talk (16)                 ~90     0.280   0.061   0.251   0.055   0.223   0.058 
Mutual silence (16)              ~271    0.425   0.088   0.466   0.088   0.495   0.080 
Alternation silence (32)         ~58     0.345   0.104   0.397   0.116   0.456   0.126 
Pause in isolation (32)          ~77     0.488   0.093   0.502   0.092   0.512   0.091 
Solitary talkspurt (32)          ~106    1.359   0.503   1.173   0.453   0.955   0.422 
Interruption (32)                ~45     0.792   0.266   0.742   0.303   0.695   0.354 
Speech after interruption (32)   ~45     0.867   0.366   0.775   0.336   0.650   0.277 
Speech before interruption (32)  ~45     0.895   0.282   0.831   0.358   0.673   0.316 

* Per person (k = 32) or per conversation (k = 16). 

† Values of n were obtained by dividing the total number of events for 
-40 dBm threshold by k. These numbers thus give a rough idea of the 
frequency of these events per person (or conversation). These same 
values also apply to Table V. 



4.6 Means and Sigmas for Medians of Events 

Table V lists the means of the medians of the categorized events 
per person (or conversation). For example, the "average of median" 
talkspurt length of 0.788 second for a -45 dBm threshold is the 
average of 32 talkspurt medians, with 0.229 second as the standard 
deviation of the 32 medians among speakers. 

* σ = [(n/(n - 1)) × (sample variance)]^(1/2) throughout this paper. 

Table V - Means and Standard Deviations of the 
Medians of Events* 

                                 -45 dBm         -40 dBm         -35 dBm 
Event (k)                        Mean    σ       Mean    σ       Mean    σ 
                                 (seconds)       (seconds)       (seconds) 

Talkspurt (32)                   0.788   0.229   0.756   0.250   0.652   0.307 
Pause (32)                       0.759   0.184   0.779   0.193   0.706   0.195 
Double talk (16)                 0.199   0.048   0.181   0.046   0.153   0.047 
Mutual silence (16)              0.332   0.056   0.366   0.056   0.378   0.045 
Alternation silence (32)         0.264   0.082   0.312   0.101   0.347   0.096 
Pause in isolation (32)          0.397   0.091   0.389   0.079   0.384   0.069 
Solitary talkspurt (32)          0.890   0.326   0.799   0.325   0.660   0.344 
Interruption (32)                0.418   0.218   0.405   0.218   0.387   0.233 
Speech after interruption (32)   0.487   0.156   0.410   0.144   0.383   0.166 
Speech before interruption (32)  0.503   0.160   0.439   0.151   0.346   0.148 

* Per person (k = 32) or per conversation (k = 16). 

4.7 Male vs Female 

Table VI lists the means of the averages of the events per person 
or conversation, for the men and women separately. (Data on 
averages of the medians are available from the author.) 

Table VI - Means of Averages of Events 

                 Threshold                     Signif.   M -40†   M -35 
Event            (dBm)       Male     Female   Level*    F -45    F -40 

Talkspurt        -45         1.503    1.229    -         -        - 
                 -40         1.393    1.000    0.01 
                 -35         1.221    0.739    0.01 

Pause            -45         1.911    1.690    -         -        - 
                 -40         2.003    1.690    - 
                 -35         1.885    1.598    - 

Double talk      -45         0.278    0.282    -         -        - 
                 -40         0.247    0.256    - 
                 -35         0.235    0.211    - 

Mutual           -45         0.452    0.397    -         0.05     - 
silence          -40         0.491    0.441    - 
                 -35         0.506    0.484    - 

Alternation      -45         0.354    0.336    -         -        - 
silence          -40         0.403    0.391    - 
                 -35         0.448    0.464    - 

Pause in         -45         0.523    0.453    0.05      0.05     0.05 
isolation        -40         0.533    0.471    - 
                 -35         0.533    0.492    - 

Solitary         -45         1.510    1.207    -         -        - 
talkspurt        -40         1.377    0.969    0.01 
                 -35         1.204    0.706    0.01 

Interruption     -45         0.807    0.777    -         -        - 
                 -40         0.821    0.662    - 
                 -35         0.811    0.578    - 

Speech after     -45         0.963    0.770    -         -        - 
interruption     -40         0.840    0.710    - 
                 -35         0.741    0.559    - 

Speech before    -45         0.793    0.997    0.05      -        0.05 
interruption     -40         0.648    1.014    0.01 
                 -35         0.484    0.862    0.01 

* Compares events at a common threshold. 

† Compares males at -40 dBm with females at -45 dBm threshold; similar 
for last column. All significance levels from t-test. 

4.8 Transitional Probabilities 

As we mentioned, some researchers have postulated first-order 
Markov processes to model speech patterns. A conversation, at any 
instant, can exist in one of four states depending on who is talking: 
neither, A, B, or both. If the conversation is in state i (i = 1,2,3,4) 
at some time t, it may be of interest to know the probability of being 
in state j (j = 1,2,3,4) at t + Δt, possibly to establish a crude 
Markovian model for distributions of times in each state. Notice, 
however, that the simplest Markovian model will predict that each state 
will have an exponential distribution, which is a hypothesis not 
generally supported by the data. (For example, mutual silence is the 
event representing the first state, and a glance at Fig. 5 shows that 
this distribution is strongly colored by the 200 msec fill-in time.) 

This paper is primarily a collection of data, and is not intended 
to pursue the problem of modeling conversational behavior. We shall 
therefore simply list the transition matrix for the -40 dBm threshold* 
in Table VII. The table indicates transition probabilities for 5 msec 
time steps. For example, if both are talking, the probability is 0.98095 
that they will still both be talking 5 msec later. The conversation will, 
therefore, leave the state with p = 1.0 - 0.98095 = 0.01905. If a 
Poisson termination process* is assumed for terminating the event, 
the conversation would leave the state in 1 msec with p = 0.01905/5. 

* Transition probabilities for the other thresholds may be obtained 
from the author. 
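
The arithmetic of this example extends to every state of Table VII. The 
sketch below is a minimal illustration and not part of the original 
analysis: the function names are hypothetical, the A-to-both entry is 
taken as 0.00126 on the assumption that each row sums to 1, and the 
mean time in a state is computed under the simple Markov assumption 
that the text cautions is not generally supported by the data. 

```python
STEP_MS = 5  # Table VII gives probabilities per 5-msec step
STATES = ("neither", "A", "B", "both")

# Rows are "from", columns are "to", in the order of STATES
# (-40 dBm threshold). The value 0.00126 is a reconstruction that
# assumes the row for state A sums to 1.
P = [
    [0.98940, 0.00529, 0.00530, 0.00001],
    [0.00387, 0.99486, 0.00001, 0.00126],
    [0.00367, 0.0,     0.99510, 0.00123],
    [0.00005, 0.00885, 0.01015, 0.98095],
]


def leave_probability(state):
    """Probability of leaving the state within one 5-msec step."""
    i = STATES.index(state)
    return 1.0 - P[i][i]


def leave_probability_per_msec(state):
    """Per-msec leaving probability under the Poisson assumption."""
    return leave_probability(state) / STEP_MS


def mean_dwell_ms(state):
    """Mean time in the state if the number of steps is geometric."""
    return STEP_MS / leave_probability(state)


for name in STATES:
    print(name, round(leave_probability(name), 5), round(mean_dwell_ms(name), 1))
```

Under these assumptions the mean time in the "both" state comes to 
about 262 msec and in the "neither" state to about 472 msec, which may 
be compared with the -40 dBm means of 0.251 and 0.466 second for double 
talk and mutual silence in Table IV. 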

Table VII - Transition Probabilities of Changing State* 

                 To 
From             Neither    A          B          Both 

Neither          0.98940    0.00529    0.00530    0.00001 
A                0.00387    0.99486    0.00001    0.00126 
B                0.00367    0.0        0.99510    0.00123 
Both             0.00005    0.00885    0.01015    0.98095 

* In a 5-msec period for the -40 dBm threshold condition. 

V. OBSERVATIONS 

Events of one subject that do not involve interaction with 
talkspurts of his partner include pauses in isolation and solitary 
talkspurts. Data on behavior during double talking should be contrasted 
with data on these "isolated" events rather than, for example, the 
distribution of all talkspurts, since this distribution includes events 
during double talking. 

The data are notably influenced by threshold changes. The author 
does not believe it is possible, from results reported here, to establish 
a single "correct" threshold. It is possible, however, to draw certain 
conclusions which are threshold independent (see (vi) and (vii) below, 
and Section 5.2.) 

* A good discussion of Poisson and Markovian processes may be found in 
Reference 8. 

5.1 The Data 

We know from the data that: 

(i) As the threshold is raised (speech detector made less sensitive), 
events which measure periods of talking tend to decrease in length, 
since the longer events tend to be broken up into short ones. These 
events include talkspurts, double talks, solitary talkspurts, 
interruptions, and speech before and after interruption. 

(ii) As the threshold is raised, events which measure periods of 
silence tend to increase in length. These events include pauses, mu- 
tual silences, alternation silences, and pauses in isolation. 

There are some individual speaker exceptions to these observations. 
For example, male No. 14 talkspurt averages are 1.683, 1.620, and 
1.759 seconds for —45, —40, and —35 dBm thresholds, respectively. 
Two other male speakers exhibit such a reversal for talkspurts. In 
general, however, conclusions drawn from the gross data are true of 
most speakers or conversations. 

(iii) The distribution functions of events resulting from periods of 
talking seem in general more strongly affected by threshold shifts 
than those resulting from silences. Compare, for example, talkspurts 
vs pauses, or solitary talkspurts vs pauses in isolation. The mutual 
silence distribution, however, seems strongly influenced by threshold 
changes. 

(iv) For all events, the number of times they occur (n) is notably 
influenced by the threshold. This is particularly true of pauses in 
isolation, whose distribution remains virtually unaffected while n 
changes from 1890 to 3322 for a 10 dB threshold shift. 

(v) As the threshold is raised, the number of talkspurts tends to 
increase. This trend will obviously be reversed if the threshold be- 
comes so high that only a few spurts of energy clear it. But for low 
thresholds, as threshold is raised, long talkspurts are apparently being 
broken up into shorter segments at a faster rate than that of low 
level talkspurts being left below the threshold. 

(vi) For any particular threshold, the cumulative distributions of 
speech before and after interruption are practically identical, as seen 
from a comparison of Figs. 10 and 11. 

(vii) Interruptions tend to be much shorter than solitary 
talkspurts, as would be expected because the interrupter might merely 
be trying to get attention rather than make a statement. Also, some 
interruptions are really not deliberate interruptions but rather ac- 
knowledgments, such as "uh huh" and "um." This effect may be seen, 
for example, at the —40 dBm threshold for which 17 percent of the 
interruptions are less than 100 msec long, while only 9.5 percent of 
solitary talkspurts are less than 100 msec. 

(viii) Many speech detectors operate with a hangover, rather than 
a fill-in, to bridge short gaps. By shifting the talkspurt distribution 
200 msec to the right and the pause distribution 200 msec left, one 
can determine the distributions for these events which would have 
resulted if a 200-msec hangover were used instead of fill-in. However, 
the "interaction event" (double talking, etc.) distributions will be 
changed in a manner which cannot be determined from our present 
data. 
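
The shift described in (viii) can be checked on a small pattern. The 
sketch below is illustrative only: the function names are hypothetical, 
times are counted in frames, and the hangover is assumed to equal the 
longest gap that fill-in would bridge. 

```python
from itertools import groupby


def run_lengths(bits):
    """Return (value, length) for each run in a 0/1 sequence."""
    return [(value, len(list(group))) for value, group in groupby(bits)]


def fill_in(bits, max_gap):
    """Fill interior gaps of max_gap frames or fewer."""
    runs = run_lengths(bits)
    out = []
    for i, (value, length) in enumerate(runs):
        interior = 0 < i < len(runs) - 1
        if value == 0 and interior and length <= max_gap:
            value = 1
        out.extend([value] * length)
    return out


def hangover(bits, hang):
    """Hold the detector on for hang frames after speech stops."""
    out, remaining = [], 0
    for bit in bits:
        if bit:
            remaining = hang
            out.append(1)
        elif remaining > 0:
            remaining -= 1
            out.append(1)
        else:
            out.append(0)
    return out


raw = [1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1]
print(run_lengths(fill_in(raw, 2)))   # talkspurt of 4, pause of 5
print(run_lengths(hangover(raw, 2)))  # talkspurt of 6, pause of 3
```

With the same 2-frame setting, the first talkspurt grows by 2 frames 
and the pause shrinks by 2 frames, which is the right and left shift of 
the two distributions. The last talkspurt is cut off by the end of the 
record and so does not show the shift. 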

5.2 Male vs Female Speech 

Table VI shows that when male and female speech is compared at 
the same threshold, four events show a statistically significant dif- 
ference:* talkspurt, pause in isolation, solitary talkspurt, and speech 
before interruption. With the exception of pause in isolation, which 
is significant only at —45 dBm threshold, these are events resulting 
from talking rather than silence. 

Some of the apparent difference in male and female speech may 
result from a difference in average levels. The average speech level 
for the females was 5.94 dB below the average male speech level 
(measured in apl). When male speech at -40 dBm threshold is 
compared with female speech at -45 dBm, and when male speech at -35 
dBm is compared with female speech at -40 dBm, thus roughly 
compensating for the average 6 dB level difference, the significant 
differences previously observed tend to disappear. New events — pause 
in isolation, and possibly mutual silence and speech before interrup- 
tion — become significant. It thus appears not possible to completely 
eradicate differences in male and female speech with a simple level 
adjustment, although a level difference does account for differences 
observed in certain events. 

These conclusions are of particular interest in view of a recent 
study by Krauss and Bricker, 9 who made measurements of verbal 
interaction (measured from transcripts of the conversations) when 
pairs of men and pairs of women talked over a circuit containing 
voice-operated, fixed threshold devices. The verbal behavior of the 
two sexes was significantly different in certain tasks. One wonders 
if the devices operated differently on the male and female speech, 
as they did in the present study. This could be a contributing factor 
in bringing about the behavioral difference reported by Krauss and 
Bricker. 

* Differences are significant at ≤ 0.05 level. 

VI. CONCLUSION 

We hope that the publication of these data will encourage other 
researchers to make further observations leading toward a general 
model of the speech patterns occurring in conversations. We also hope 
that by emphasizing the events surrounding double talking and other 
speaker interaction, it may be possible to draw conclusions regarding 
difficulties in conversing on certain circuits that have voice-operated 
devices. 